Implicit push data transfer

ABSTRACT

A computer-implemented method comprises intercepting, by a first programmatic container of a first computing device, a system call made by a second programmatic container to an operating system of the first computing device. This example method also includes, in response to intercepting the system call, generating, by the first programmatic container, an enriched message based at least in part on the intercepted system call and a metrics message sent from the second programmatic container to an interface of the first computer. Further, this example method includes sending the enriched message to a monitoring application hosted on a second computer.

RELATED APPLICATION DATA AND CLAIM OF PRIORITY

This application is a Continuation of prior U.S. patent application Ser. No. 15/170,290, filed Jun. 1, 2016, which claims the benefit under 35 U.S.C. § 119(e) of provisional application 62/169,542, filed Jun. 1, 2015, the entire contents of each of which are hereby incorporated by reference for all purposes as if fully set forth herein.

TECHNICAL FIELD

The present disclosure generally relates to inter-process data communications in containerized computer systems. The disclosure relates more specifically to communicating data between a first process within a first container and a second process within a second container without the need for a local collector process.

BACKGROUND

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

Managing computer program applications running on networked computing devices typically involves some aspect of monitoring the applications. Monitoring can involve collecting application messages and other data traffic that the applications emit toward a network, directed at peer instances of the applications, directed at servers, or directed at client computing devices. The open source software project “statsd” (or STATSD) has emerged as a popular means of collecting application traffic and aggregating the traffic for analysis. The “statsd” software is organized as a daemon that can perform statistics aggregation and is available at the time of this writing in the Github repository system via the repository name etsy/statsd.

Containerization has emerged as a popular alternative to virtual machine instances for developing computer program applications. With containerization, computer program code can be developed once and then packaged in a container that is portable to different platforms that are capable of managing and running the containers. Consequently, containerization permits faster software development for the same program for multiple different platforms that would otherwise require separate source branches or forks, or at least different compilation and execution environments. However, containerization also can impose constraints on inter-program communications.

SUMMARY

The appended claims may serve as a summary of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings:

FIG. 1 illustrates an example system consisting of a virtual machine, a cloud agent, and a plurality of applications according to the current state of the art.

FIG. 2 illustrates an example system consisting of a plurality of containers and a monitoring application for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process according to one embodiment.

FIG. 3 illustrates a process for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process according to one embodiment.

FIG. 4 illustrates a computer system upon which an embodiment of the invention may be implemented according to one embodiment.

While each of the drawing figures illustrates a particular embodiment for purposes of illustrating a clear example, other embodiments may omit, add to, reorder, or modify any of the elements shown in the drawing figures. For purposes of illustrating clear examples, one or more figures may be described with reference to one or more other figures, but using the particular arrangement illustrated in the one or more other figures is not required in other embodiments. For example, container 210, container 212, container 214 in FIG. 2 may be described with reference to several steps in FIG. 3 and discussed in detail below, but using the particular arrangement illustrated in FIG. 2 is not required in other embodiments.

DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention. Furthermore, words, such as “or,” may be inclusive or exclusive unless expressly stated otherwise.

Embodiments are described herein according to the following outline:

1.0 General Overview
2.0 Example System for Communicating Data between a First Process within a First Container and a Second Process within a Second Container
3.0 Example System for Communicating Data between a First Process within a First Container and a Second Process within a Second Container without the Need for a Local Collector Process
4.0 Process for Communicating Data between a First Process within a First Container and a Second Process within a Second Container without the Need for a Local Collector Process
5.0 Selected Benefits of Embodiments
6.0 Implementation Mechanisms - Hardware Overview
7.0 Other Aspects of Disclosure

1.0 General Overview

Systems and methods are discussed herein for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process. In one embodiment, a computer implemented method for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process comprises executing, in a first container of a first computer system, input source instructions; executing, using the same first computer system, a plurality of containerized application programs in different corresponding containers; monitoring, by the input source instructions, the one or more different containerized application programs by identifying one or more system calls that resulted from the different container applications generating statistical messages relating to operation of the containerized application programs; generating, by the input source instructions, one or more enriched messages based on the system calls that were identified and based on the statistical messages, transmitting the one or more enriched messages to a first metric collector, and aggregating a plurality of the enriched messages into a set of aggregated metrics values; sending, from the first metric collector to a monitoring application that is hosted on a second computer system, the aggregated metrics values.

In another embodiment, a computer-implemented method comprises executing, in a first container of a first computer system, input source instructions; executing, using the same first computer system, a plurality of containerized application programs in different corresponding containers; monitoring, by the input source instructions, the one or more different containerized application programs by identifying one or more system calls that resulted from the different container applications generating statistical messages relating to operation of the containerized application programs and communicating the statistical messages to a “localhost” interface of the first computer system, wherein each of the one or more system calls is one of: read, write, send, sendto, recv, recvfrom, sendmsg, sendmmsg, recvmsg, recvmmsg, pread, pwrite, readv, writev, preadv, pwritev, sendfile; generating, by the input source instructions, one or more enriched messages based on the system calls that were identified and based on the statistical messages by adding one or more of a container name tag, an application ID tag, and an image name tag to the statistical messages; transmitting the one or more enriched messages to a first metric collector; aggregating a plurality of the enriched messages into a set of aggregated metrics values; sending, from the first metric collector to a monitoring application that is hosted on a second computer system, the aggregated metrics values.

In another embodiment, a computer system comprises a first programmatic container that contains an application program that is programmed to send a plurality of application metrics messages to a localhost interface of the computer system and to cause generating one or more system calls each time that one of the application metrics messages is sent; a second programmatic container, logically separate from the first programmatic container, that is programmed to host a set of input source instructions and a collector module; wherein the input source instructions are programmed to listen for the one or more system calls and, in response to detecting a particular system call, to obtain a particular application metrics message that is associated with the particular system call, to tag the particular application metrics message with one or more tag values and to send the particular application metrics message with the tag values to the collector module.

In some approaches, techniques to aggregate and summarize application metrics consist of a metric collector that resides on a different machine and aggregates traffic from all metric sources. Typically, the metric collector listens for any metrics sent to it from the applications it is monitoring, in what is called active collection of metrics. The metric collector is reachable through a static IP address or an ad hoc DNS entry.

However, this becomes cumbersome because each metric update must travel across the network to the metric collector, which imposes a tradeoff between the frequency of metric updates and the network bandwidth that is consumed. As a result, in situations where conserving network bandwidth is preferred, fewer metric updates are available than desired. Additionally, these metrics may travel separately from metrics gathered under different metric systems but corresponding to the same application or container, thus decreasing the opportunity to compress and efficiently transmit performance data. Finally, it is not possible to tag and enhance the metrics with context data for successive segmentation, because information is lost about which container, host or application generated the metric.

In other approaches, which attempt to deal with these limitations, each container hosts a local metric collector. Each local metric collector aggregates different types of metrics from different metric systems into samples that are sent to a general purpose monitoring backend at regular intervals. These aggregated metrics sample messages then travel across the network to reach a monitoring backend program. While this approach is more efficient than the prior one, particularly with bigger deployments, because metrics are aggregated and compressed before they are sent to the monitoring backend, it runs into many limitations in containerized systems. The addition of a metric collection agent to every container is inefficient, complicates deployments, and does not adhere to the container philosophy of having one process per container.

To address this inefficient duplication of metric collection agents in every container, other approaches place a metric collector on the same machine as the containers but in its own monitoring container. The monitoring container is configured for collecting system metrics, stitching everything together and sending samples to a general-purpose backend at regular intervals. While this solves the problem of duplicate metric collectors, the applications in each container must be configured with target locations to which the applications should send the metrics. This mechanism is rudimentary and fragile. For example, it makes it hard to update the monitoring container, because each update will almost certainly change the IP address of the monitoring container and destroy the linking. Another approach is assigning a static IP address to the monitoring container. This has all the limitations involved with using static IP addresses, including possible address conflicts if a monitoring container is needed on each physical host.

2.0 Example System for Communicating Data Between a First Process within a First Container and a Second Process within a Second Container.

FIG. 1 illustrates an example computer system that is configured to perform monitoring of application metrics using either active or passive collection.

In the example of FIG. 1, one or more applications 170 and a virtual machine 160, which may comprise a JAVA virtual machine or other types of virtual machines, are hosted and execute in user space 100 under control of an operating system. One or more applications or apps 162 execute under control of the virtual machine 160. For purposes of illustrating a clear example, FIG. 1 shows the apps as JAVA apps, but other embodiments may be used with JAVASCRIPT, PYTHON, PHP, RUBY, GO, and others. Other computing resources such as network 120, memory 130, CPU 140, and filesystem 150 are hosted or execute in kernel space 190, which is isolated from user space 100 by operating system operations.

In an embodiment, a monitoring cloud agent 110 is communicatively coupled to the virtual machine 160 via management library 164 and poller 114, and communicatively coupled to application 170 via metric library 175 and metric collector 116. In an embodiment, the monitoring cloud agent 110 also comprises a watchdog process 112 and agent process 118. The monitoring cloud agent 110 is communicatively coupled to the resources in kernel space 190. Metric library 175 is communicatively coupled to external collector 180.

In the example system of FIG. 1, metrics are collected both actively and passively. In active collection 102, the monitoring cloud agent 110 receives metrics from applications 170 via metric library 175, which dictates how to transmit the communications such that the metric collector 116 receives them and understands how to interpret them. Additionally, in passive collection 104, agent process 118 automatically intercepts communication between metric library 175 and external collector 180.

In an embodiment, the monitoring cloud agent 110 comprises an embedded metrics server process, such as a STATSD server, which has been programmed or configured so that custom metrics are sent to a collector and relayed to a back-end database system for aggregation. Applications can define specific metrics, and those custom metrics plus standard metrics that are pre-programmed can be visualized in the same graphical interface. For purposes of illustrating a clear implementation example, this description focuses on techniques applicable to deployment of the STATSD statistics aggregation daemon software. However, the techniques described herein may be used with other systems that are programmed using push-based protocols not related to aggregation or statistics, and use with STATSD is not required. For example, the “metrics” library, which is available at the time of this writing in the Github repository “dropwizard”, may be used with the techniques herein.

In an embodiment, with active collection, a collector program listens on port “8125,” which is the standard STATSD port, on TCP and UDP. STATSD is a text-based protocol in which data samples are separated by the character \n. Programming STATSD to send metrics from an application to the collector can be performed using the following example command:

    echo "hello_statsd:1|c" | nc -u -w0 127.0.0.1 8125

In this example, the counter metric “hello_statsd” is transmitted with a value of “1” to the netcat process, which handles the UDP network write operation to the collector on port “8125”.
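The same UDP write can be issued from application code. The following is a minimal Python sketch rather than part of any described embodiment; the function name send_statsd is hypothetical, and the host and port defaults are simply the values used in the example command.

```python
import socket


def send_statsd(metric: str, host: str = "127.0.0.1", port: int = 8125) -> int:
    """Send one STATSD sample as a UDP datagram.

    Returns the number of bytes handed to the operating system. UDP is
    connectionless, so the call can succeed even when no collector is
    listening on the destination port.
    """
    payload = metric.encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # sendto() results in a system call regardless of whether a
        # receiver exists at (host, port).
        return sock.sendto(payload, (host, port))
```

For example, send_statsd("hello_statsd:1|c") writes the same counter sample as the netcat command.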

In one embodiment, the protocol format is:

<metric_name>:<value>|<type>[@<sampling_ratio>]

Each <metric_name> can be any string except certain reserved characters such as “#”. The <value> is a number and depends on the metric type. Sampling ratio is a value between 0 (exclusive) and 1, and is used to handle sub sampling.
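As an illustration of the format above, the following Python sketch splits one sample into its fields. It is a sketch under stated assumptions, not a reference parser: the names Sample, parse_sample and parse_payload are hypothetical, the sampling ratio is accepted either directly after the type (as in the format above) or as a separate “|@” field, an upper bound of 1 (inclusive) is assumed, and the value is kept as text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    name: str
    value: str  # kept as text; interpretation depends on the metric type
    type: str
    sampling_ratio: Optional[float] = None


def parse_sample(line: str) -> Sample:
    """Parse '<metric_name>:<value>|<type>[@<sampling_ratio>]'."""
    name, sep, rest = line.partition(":")
    if not sep or not name:
        raise ValueError(f"missing metric name or ':' in {line!r}")
    fields = rest.split("|")
    if len(fields) < 2 or not fields[1]:
        raise ValueError(f"missing '|<type>' in {line!r}")
    value = fields[0]
    # Assumption: the ratio may follow the type directly ("c@0.1")
    # or arrive as its own field ("c|@0.1").
    mtype, at, ratio_text = fields[1].partition("@")
    if not at and len(fields) > 2 and fields[2].startswith("@"):
        at, ratio_text = "@", fields[2][1:]
    ratio = None
    if at:
        ratio = float(ratio_text)
        if not 0 < ratio <= 1:
            raise ValueError(f"sampling ratio out of range in {line!r}")
    return Sample(name, value, mtype, ratio)


def parse_payload(payload: str) -> List[Sample]:
    """Split a payload on the newline separator and parse each sample."""
    return [parse_sample(line) for line in payload.split("\n") if line]
```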

In an embodiment, the metric type indicated by <type> can be any of: counter, histogram, gauge, and set. Other embodiments may implement other forms of metrics. A counter metric is updated with a value that is sent by the application, sent to the back-end database, and then reset to zero. An application can use a counter, for example, to count how many calls have been made to an API. Negative values result in decrementing a counter. A histogram metric may be used, for every sample received, to calculate aggregations such as sum, min, max, mean, count, median, and percentiles. Histograms may be used to send metrics such as access time, file size, and others. A gauge is a single value that is transmitted “as is”. Relative increments or decrements of a counter can be achieved by specifying “+” or “−” before a gauge value. A set is like a counter but counts unique elements. As an example, the following syntax causes the value of “active_users” to be “2”:

    active_users:user1|s
    active_users:user2|s
    active_users:user1|s

In an embodiment, metrics may be tagged using strings, key-value pairs, and other values.
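The behavior of the four metric types can be sketched as a small in-memory aggregator. This is an illustrative Python sketch, not the STATSD implementation: the class name Aggregator and the type codes “c”, “g”, “s” and “h” are assumptions (only “c” and “s” appear in the examples herein), percentiles are omitted for brevity, and resetting sets and histograms on flush is assumed by analogy with counters.

```python
from statistics import mean, median


class Aggregator:
    """Hypothetical aggregator for counter, gauge, set and histogram samples."""

    def __init__(self):
        self.counters = {}
        self.gauges = {}
        self.sets = {}
        self.histograms = {}

    def update(self, name: str, value: str, mtype: str) -> None:
        if mtype == "c":
            # Counters accumulate; negative values decrement.
            self.counters[name] = self.counters.get(name, 0.0) + float(value)
        elif mtype == "g":
            if value[:1] in ("+", "-"):
                # A leading sign means a relative change to the stored value.
                self.gauges[name] = self.gauges.get(name, 0.0) + float(value)
            else:
                self.gauges[name] = float(value)
        elif mtype == "s":
            # Sets count unique elements only.
            self.sets.setdefault(name, set()).add(value)
        elif mtype == "h":
            self.histograms.setdefault(name, []).append(float(value))
        else:
            raise ValueError(f"unknown metric type {mtype!r}")

    def flush(self) -> dict:
        """Return a snapshot and reset everything except gauges."""
        snapshot = {
            "counters": dict(self.counters),
            "gauges": dict(self.gauges),
            "sets": {n: len(members) for n, members in self.sets.items()},
            "histograms": {
                n: {"count": len(v), "sum": sum(v), "min": min(v),
                    "max": max(v), "mean": mean(v), "median": median(v)}
                for n, v in self.histograms.items()
            },
        }
        self.counters.clear()
        self.sets.clear()
        self.histograms.clear()
        return snapshot
```

Feeding the three “active_users” samples from the example above into update() and then calling flush() reports a set size of 2, because “user1” is counted once.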

Turning now to passive collection, in infrastructures already containing a third party STATSD collection server, STATSD metrics can be collected “out of band”. A passive collection technique is automatically performed by the monitoring cloud agent 110 by intercepting system calls. This method does not require changing a current STATSD configuration. Passive collection is particularly useful for containerized environments in which simplicity and efficiency are important. In an embodiment, with a containerized version of the monitoring cloud agent 110 running on the host, all other container applications can continue to transmit to any currently implemented collector. If no collector is executing, then container applications can be configured to send STATSD metrics to the localhost interface (127.0.0.1) as shown in the example command above; there is no requirement for a STATSD server to be listening at that address.

In effect, each network transmission made from inside the application container, including STATSD messages that are sent to a non-existent destination, generates a system call. The monitoring cloud agent 110 captures these system calls from its own container, where the STATSD collector is listening. In practice, the monitoring cloud agent 110 acts as a transparent proxy between the application and the STATSD collector, even if they are in different containers. The agent correlates which container a system call is coming from, and uses that information to transparently tag the STATSD messages.

3.0 Example System for Communicating Data Between a First Process within a First Container and a Second Process within a Second Container without the Need for a Local Collector Process.

FIG. 2 illustrates an example system consisting of a plurality of containers and a monitoring application for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process according to one embodiment.

In the example of FIG. 2, a computer system 200 hosts or executes a plurality of containers 210, 212, 214, 230. For example, each of the containers may be instantiated and managed using the DOCKER® containerization system, commercially available from Docker Inc., San Francisco, Calif., or using the LXC containerization system or CoreOS containers. Each of the container 210, container 212, and container 214 respectively contains container application 220, container application 222, and container application 224. Three (3) such containers and applications are shown solely to illustrate a clear example, and other embodiments may use any number of containers. These may be independent applications having different functionality, or may be different instances of the same application; the applications 220, 222, 224 emit application metrics.

Container 230 comprises input source instructions 240, metric collector 250, and a database or repository of metrics 260. In an embodiment, the input source instructions 240 comprise the SYSDIG or “sysdig” cloud agent software that is commercially available from Draios, Inc., Davis, Calif. The metric collector 250 may be implemented as a STATSD agent, as an example. Container 230 further is communicatively coupled using a network connection to monitoring application 270, which typically is hosted or executed using a machine separate from the computer 200. Monitoring application 270 may be termed a monitoring back-end and may comprise persistent data storage, analytics systems, and/or a presentation layer for user interaction.

Instructions 240 may comprise a program that is configured to enrich, aggregate, analyze and report upon metrics that are collected not just via STATSD, but from any of a plurality of different programs, apps, systems or subsystems that may be distributed throughout a distributed system in relation to applications or infrastructure or both. The instructions 240 may be programmed to correlate data received from the metric collector 250 with other metrics received across the computing environment to result in creating and storing system-application metrics 260.

For purposes of illustrating a clear example, monitoring application 270 is pictured outside of computer system 200; however, monitoring application 270 can also reside in computer 200 with container 210, container 212, container 214, and container 230. A “computer” may be one or more physical computers, virtual computers, or computing devices. As an example, a computer may be one or more server computers, cloud-based computers, cloud-based cluster of computers, virtual machine instances or virtual machine computing elements such as virtual processors, storage and memory, data centers, storage devices, routers, hubs, switches, desktop computers, laptop computers, mobile devices, or any other special-purpose computing devices. Any reference to “a computer” herein may mean one or more computers, unless expressly stated otherwise, and any reference to a “router” can mean any element of internetworking gear. Further, each of the containers 210, 212, 214 may be physically present in a computer that is local to an enterprise or owner or operator, or located in a shared computing center such as in a cloud computing environment.

In this arrangement, the applications within the containers 210, 212, 214 send metrics, for example in the form of STATSD messages, to the “localhost” interface. This may be accomplished by programming or configuring the STATSD daemon to write to the local address “127.0.0.1”. Otherwise, there is no need to code a collector IP address in the apps, and there is no need to deal with the complications imposed by static programming of an address. Since there is no STATSD collector on the localhost interface, the UDP payload of the emitted STATSD messages is dropped in each case, which is illustrated in FIG. 2 by “trashcan” icons. However, the same message automatically appears in the monitoring container, where it is received by instructions 240. In response, the instructions 240 may enrich the received metrics message with one or more tags that can be used for segmentation or other downstream analysis. Example tags include a container name, application ID and image name. In one embodiment, the instructions 240 may be programmed to receive definitions of additional tags that are specified in user-created configuration data.
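The enrichment step can be illustrated with a short Python sketch. The function name enrich, the tag keys, and the “|#key=value,...” suffix are assumptions chosen only for illustration; the disclosure does not fix a wire format for tags.

```python
from typing import Dict, Optional


def enrich(message: str,
           container_name: str,
           application_id: str,
           image_name: str,
           extra_tags: Optional[Dict[str, str]] = None) -> str:
    """Append context tags to a metrics message.

    The '|#key=value,...' suffix is a hypothetical encoding used only
    for this sketch.
    """
    tags = {
        "container_name": container_name,
        "application_id": application_id,
        "image_name": image_name,
    }
    # Additional tags, for example from user-created configuration data.
    tags.update(extra_tags or {})
    suffix = ",".join(f"{key}={value}" for key, value in tags.items())
    return f"{message}|#{suffix}"
```

With hypothetical values, enrich("hello_statsd:1|c", "web-1", "app-42", "nginx:1.9") yields a message that carries the original sample followed by the three tags, so downstream analysis can segment by container, application or image.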

Further, in one embodiment, the instructions 240 may be programmed to merge the metrics messages with other system, network or application metrics that have been generated internally using the instructions 240. The combined metrics may be compressed and then communicated to the back-end system at any suitable rate, such as once per second.

Information about how to set up a “sysdig” cloud agent, as one example implementation of instructions 240, is described in documents that are available online at the time of this writing in the files “204498905-Agent-Installation-Instructions” and “204418585-Container-Deployment,” both in the “/hc/en-us/articles/” folder of the domain “support.sysdigcloud.com” on the internet, and can be retrieved using HTTP.

Lines 280 in FIG. 2 indicate implicit communication paths from application containers 210, 212, 214 to the monitoring container 230. To accomplish transmission of metrics messages, such as STATSD messages, from the application containers to the monitoring container 230, in an embodiment, each network transmission made from inside the application containers 210, 212, 214, including STATSD messages and including any other messages sent to a non-existent destination, generates a system call inherently via operation of the containerization system. The instructions 240 are programmed to capture or listen for such system calls, from a separate container 230 that also includes the metrics collector 250, which also is programmed to listen for system calls. In practice, the instructions 240 act as a transparent proxy between the applications in containers 210, 212, 214 and the collector 250, even if they are in different containers.

Specific example techniques that can be used to cause the instructions to detect system calls and respond to the system calls are disclosed in application Ser. No. 13/953,970, filed Jul. 30, 2013, US patent publication 20150039745A1, the entire contents of which are hereby incorporated by reference for all purposes as if fully set forth herein. The reader of the present patent document is assumed to have familiarity with and understand US patent publication 20150039745A1 for purposes of implementing the techniques disclosed herein.

Examples of system calls that a push-based protocol could generate, and that the instructions 240 could be programmed to listen for, include: read, write, send, sendto, recv, recvfrom, sendmsg, sendmmsg, recvmsg, recvmmsg, pread, pwrite, readv, writev, preadv, pwritev, sendfile. Other system calls can be used depending on the operating system family of the machine that hosts the containers, operating system version, and processor architecture.

The instructions 240 also are programmed to determine which container a particular system call is coming from, and the instructions 240 may use that information to transparently tag the STATSD message.
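The filtering and attribution logic can be sketched in Python over simulated events. This sketch does not capture real system calls, which requires the kernel-level techniques incorporated by reference above; the SyscallEvent structure, the pid-to-container mapping, and the use of destination port 8125 as a filter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# System calls of interest, taken from the list given in this description.
WATCHED_SYSCALLS = frozenset({
    "read", "write", "send", "sendto", "recv", "recvfrom", "sendmsg",
    "sendmmsg", "recvmsg", "recvmmsg", "pread", "pwrite", "readv",
    "writev", "preadv", "pwritev", "sendfile",
})


@dataclass
class SyscallEvent:
    """Hypothetical record of one captured system call."""
    name: str
    pid: int
    dest_port: Optional[int]
    payload: bytes


def handle_event(event: SyscallEvent,
                 pid_to_container: Dict[int, str],
                 statsd_port: int = 8125) -> Optional[Tuple[str, str]]:
    """Return (container, message) for a metrics write, or None to ignore."""
    if event.name not in WATCHED_SYSCALLS:
        return None
    if event.dest_port != statsd_port:
        return None
    # Attribute the call to a container via the calling process.
    container = pid_to_container.get(event.pid)
    if container is None:
        return None
    return container, event.payload.decode("ascii", errors="replace")
```

The returned container name is the piece of context that would otherwise be lost once the UDP payload is dropped at the localhost interface.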

4.0 Process for Communicating Data Between a First Process within a First Container and a Second Process within a Second Container without the Need for a Local Collector Process.

FIG. 3 illustrates a process when performed on the example system of FIG. 2 for communicating data between a first process within a first container and a second process within a second container without the need for a local collector process according to one embodiment.

In step 310, input source instructions 240 are executed in container 230 in computer 200.

In step 320, also in computer 200, a plurality of containerized application programs in different corresponding containers are executed. Here, container application 220, container application 222, container application 224 are executed in container 210, container 212, container 214 respectively.

Because container 210, container 212, container 214, and container 230 all reside on the same computer, they each will execute system calls in order to interact with the resources and applications comprising computer 200, in addition to sending statistical messages regarding each container's performance.

In step 330, input source instructions 240 monitor the one or more different containerized application programs by identifying one or more system calls that resulted from different container applications generating statistical messages relating to operation of the containerized application programs.

Here, an explicit communication path between container application 220, container application 222, container application 224 and input source instructions 240 is intentionally omitted from FIG. 2, because there is no such communication. In an embodiment, each network transmission made from inside the application containers, including statistical messages and including ones sent to a nonexistent destination, generates a system call. Input source instructions 240 monitor for system calls and detect these system calls originating from container 210, container 212, and container 214.

As container 230 resides on the same computer as container 210, container 212, container 214, input source instructions 240 can be configured to listen to system calls made by container application 220, container application 222, container application 224 to computer 200. Some examples of system calls that the input source instructions can monitor include but are not limited to: read, write, send, sendto, recv, recvfrom, sendmsg, sendmmsg, recvmsg, recvmmsg, pread, pwrite, readv, writev, preadv, pwritev, sendfile. Other system calls may be used depending on the operating system family of the machine that hosts the containers, operating system version and processor architecture.

In step 340, input source instructions 240 generate one or more enriched messages based on the system calls that were identified and based on the statistical messages.

Here, input source instructions 240 generate enriched messages based on the system calls that were monitored and the statistical messages sent regarding the performance of the container 210, container 212, container 214, and container application 220, container application 222, container application 224. These enriched messages can contain metadata and tags that aid in fine-tuning performance. Example tags include but are not limited to a container name, application ID, and image name.

Additionally, input source instructions 240 can be programmed to pull associated groupings and hierarchies automatically so that segmenting the enriched messages by group or by host can be done readily. For example, if container 210, container 212, container 214, and container application 220, container application 222, container application 224 were related to one another by grouping or hierarchy, input source instructions 240 can further segment enriched messages such that metric collector 250 can send related data together to monitoring application 270.
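Segmentation by group or by host can be sketched as a simple grouping over tagged messages. In this Python sketch, the message layout (a dictionary with a “tags” mapping), the function name segment, and the tag values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def segment(messages: List[dict], key: str) -> Dict[str, List[dict]]:
    """Group enriched messages by the value of one tag, such as 'host'.

    Messages lacking the tag are collected under 'unknown'.
    """
    groups = defaultdict(list)
    for message in messages:
        groups[message.get("tags", {}).get(key, "unknown")].append(message)
    return dict(groups)
```

Grouping in this way lets related messages, for example all those from one host, be forwarded together.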

Additionally, input source instructions 240 can be programmed to perform automatic correlation of received statistical messages to create enriched messages. These enriched messages can, but are not required to, take the form of system metrics, application metrics, infrastructure metrics, network metrics, and container metrics.

In step 350, input source instructions 240 transmit the one or more enriched messages to a first metric collector 250, and a plurality of the enriched messages are aggregated into a set of aggregated metrics values.

Here, metric collector 250 receives the one or more enriched messages from input source instructions 240 and stores them as metrics 260 in preparation for sending on to monitoring application 270 in step 360.

In step 360, metric collector 250 sends the aggregated metrics values to monitoring application 270. In order to limit bandwidth consumption, particularly when large numbers of metrics are collected on larger systems with many containers, the metric collector takes a set of aggregated metrics values and can send them on to the monitoring backend at designated intervals or in compressed format.
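Packing one interval's aggregated values for transmission might look like the following Python sketch. JSON encoding and zlib compression are assumptions chosen for illustration, as are the function names; the disclosure does not specify an encoding or compression scheme.

```python
import json
import zlib


def pack_sample(aggregated: dict, host: str) -> bytes:
    """Serialize and compress one interval's aggregated metrics values."""
    body = json.dumps({"host": host, "metrics": aggregated}, sort_keys=True)
    return zlib.compress(body.encode("utf-8"))


def unpack_sample(blob: bytes) -> dict:
    """Inverse of pack_sample, as a monitoring back-end might apply it."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

A collector could call pack_sample once per designated interval and send the resulting bytes, so that many individual metric updates cross the network as a single compressed message.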

The instructions 240 also may be programmed to pull in the associated groupings and hierarchies of a metrics system automatically, so that segmenting the STATSD data by group or by host, for example, can be done. The instructions 240 further may be programmed to perform automatic correlation of received custom application metrics with other metrics from across the computing environment in which the containers are running. Example metrics that can be correlated include: system (CPU, memory, disk usage); application (JMX, HTTP, status codes); infrastructure (SQL, MongoDB, Redis, Amazon Web Services); network (traffic, connections); containers (DOCKER, COREOS, LXC).

5.0 Selected Benefits of Embodiments

The disclosure has described a low-impact, high-efficiency mechanism to communicate data between processes located in different containers without the need for a local collector process. In one respect, the disclosure provides a mechanism to collect metrics from multiple containers without the overhead of duplicate metric collectors, complex linking, and bandwidth-heavy communication. Embodiments provide the benefits of local metrics collectors without the drawbacks described above arising from conventional container integration. For example, one benefit is that there is no need to instrument the container in any way. The programming of apps to “push metrics to localhost” is simple and easy to understand.
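For illustration, the application side of “push metrics to localhost” can be as small as the following sketch. It assumes a StatsD-style text datagram sent over UDP to 127.0.0.1 on port 8125, the conventional StatsD default; the function names format_metric and push_metric are hypothetical.

```python
import socket


def format_metric(name: str, value: float, metric_type: str = "c") -> bytes:
    """Build one StatsD-style datagram, e.g. b"api.requests:1|c"."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def push_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send the metric as a single UDP datagram without waiting for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_metric(name, value, metric_type), (host, port))
```

An application would call, for example, push_metric("api.requests", 1). Nothing in the call refers to a collector address outside the container, which is the property that the approach described herein relies on.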

Another benefit is that no special network configuration is required; for example, there is no need to deal with DNS or static IP addresses. Additionally, because the input source instructions monitor system calls, another benefit is that metric collection systems that are already implemented would not need to be modified or dismantled. The input source instructions would automatically identify system calls associated with metric communications from the containers and incorporate them.

The approach also provides local aggregation with minimal bandwidth overhead. The approach can use existing container tagging or host tagging, and permits aggregation of metrics with the best available container system without complex programming or adaptation. Containers that are already running STATSD or another metrics program do not require special instrumentation, or a STATSD server in the container, and there is no need for network tuning of bandwidth usage.

The approach disclosed herein also works when the apps are already exporting metrics to an existing collector. The instructions 240 will automatically capture these exports also, with minimal overhead and no disruption to the current export. In other words, if a particular user computer already has the STATSD project installed and running, for example, then adding the instructions 240 programmed as described herein will result in automatically capturing STATSD push metrics messages without any special configuration of STATSD. Instead, the instructions 240 are programmed to listen for those system calls that are ordinarily generated by the conventional operation of a metrics program such as STATSD, and to obtain the metrics messages that were associated with those system calls.
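The selection of relevant system calls can be sketched as a filter over captured events. The SyscallEvent record, the list of system call names, and the port check below are assumptions made for illustration; the disclosure does not specify how captured calls are represented, and the capture mechanism itself is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyscallEvent:
    """Hypothetical record of one intercepted system call."""
    syscall: str          # e.g. "sendto"
    container_id: str     # container whose process made the call
    dest_ip: str
    dest_port: int
    payload: bytes


STATSD_PORT = 8125  # assumed: the conventional StatsD UDP port


def extract_metric(event: SyscallEvent) -> Optional[str]:
    """Return the metrics line carried by the call, or None if it is unrelated."""
    if event.syscall not in ("sendto", "sendmsg", "write"):
        return None
    if event.dest_port != STATSD_PORT:
        return None
    text = event.payload.decode("ascii", errors="replace").strip()
    # A StatsD-style line looks like "name:value|type".
    if ":" in text and "|" in text:
        return text
    return None
```

Because the filter only observes calls that the application already makes, an existing export to a STATSD server would continue unchanged in this sketch.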

6.0 Implementation Mechanisms—Hardware Overview

According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.

For example, FIG. 4 is a block diagram that illustrates a computer system 400 upon which an embodiment of the invention may be implemented. Computer system 400 includes a bus 402 or other communication mechanism for communicating information, and a hardware processor 404 coupled with bus 402 for processing information. Hardware processor 404 may be, for example, a general purpose microprocessor.

Computer system 400 also includes a main-memory 406, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 402 for storing information and instructions to be executed by processor 404. Main-memory 406 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404. Such instructions, when stored in non-transitory storage media accessible to processor 404, render computer system 400 into a special-purpose machine that is customized to perform the operations specified in the instructions.

Computer system 400 further includes a read only memory (ROM) 408 or other static storage device coupled to bus 402 for storing static information and instructions for processor 404. A storage device 410, such as a magnetic disk or optical disk, is provided and coupled to bus 402 for storing information and instructions.

Computer system 400 may be coupled via bus 402 to a display 412, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 414, including alphanumeric and other keys, is coupled to bus 402 for communicating information and command selections to processor 404. Another type of user input device is cursor control 416, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 404 and for controlling cursor movement on display 412. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

Computer system 400 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 400 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main-memory 406. Such instructions may be read into main-memory 406 from another storage medium, such as storage device 410. Execution of the sequences of instructions contained in main-memory 406 causes processor 404 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.

The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 410. Volatile media includes dynamic memory, such as main-memory 406. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.

Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 402. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 404 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 400 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 402. Bus 402 carries the data to main-memory 406, from which processor 404 retrieves and executes the instructions. The instructions received by main-memory 406 may optionally be stored on storage device 410 either before or after execution by processor 404.

Computer system 400 also includes a communication interface 418 coupled to bus 402. Communication interface 418 provides a two-way data communication coupling to a network link 420 that is connected to a local network 422. For example, communication interface 418 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 418 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 418 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 420 typically provides data communication through one or more networks to other data devices. For example, network link 420 may provide a connection through local network 422 to a host computer 424 or to data equipment operated by an Internet Service Provider (ISP) 426. ISP 426 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 428. Local network 422 and Internet 428 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 420 and through communication interface 418, which carry the digital data to and from computer system 400, are example forms of transmission media.

Computer system 400 can send messages and receive data, including program code, through the network(s), network link 420 and communication interface 418. In the Internet example, a server computer 430 might transmit a requested code for an application program through Internet 428, ISP 426, local network 422 and communication interface 418.

The received code may be executed by processor 404 as it is received, and/or stored in storage device 410, or other non-volatile storage for later execution.

7.0 Other Aspects of Disclosure

In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the invention, and what is intended by the applicants to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

What is claimed is:
 1. A computer-implemented method comprising: intercepting, by a first programmatic container of a first computing device, a system call made by a second programmatic container to an operating system of the first computing device; in response to intercepting the system call, generating, by the first programmatic container, an enriched message based at least in part on the intercepted system call and a metrics message sent from the second programmatic container to an interface of the first computer; and sending the enriched message to a monitoring application hosted on a second computer.
 2. The computer-implemented method of claim 1, further comprising: sending, from the second programmatic container, the metrics message to the interface of the first computing device; and making, by the second programmatic container, the system call to the operating system of the first computing device.
 3. The computer-implemented method of claim 1, wherein the first programmatic container performs the intercepting and the generating with no direct communication between the first programmatic container and the second programmatic container.
 4. The computer-implemented method of claim 1, further comprising: intercepting, by the first programmatic container, a plurality of system calls made by the second programmatic container to the operating system of the first computing device; and in response to intercepting the plurality of system calls, generating, by the first programmatic container, a plurality of enriched messages based at least in part on the intercepted plurality of system calls and a plurality of metrics messages sent from the second programmatic container to the interface of the first computer.
 5. The computer-implemented method of claim 4, further comprising: aggregating the plurality of enriched messages; and sending the aggregated plurality of enriched messages to the monitoring application hosted on the second computer.
 6. The computer-implemented method of claim 1, further comprising: intercepting, by the first programmatic container, a plurality of system calls made by a plurality of second programmatic containers to the operating system of the first computing device; and in response to intercepting the plurality of system calls, generating, by the first programmatic container, a plurality of enriched messages based at least in part on the intercepted plurality of system calls and a plurality of metrics messages sent from the plurality of second programmatic containers to the interface of the first computer.
 7. The computer-implemented method of claim 1, further comprising generating, by the first programmatic container, a second enriched message based at least in part on a second metrics message sent from the second programmatic container to the first programmatic container.
 8. The computer-implemented method of claim 1, wherein generating the enriched message includes adding one or more of a container name tag, an application ID tag, or an image name tag to the enriched message associated with the second programmatic container.
 9. The computer-implemented method of claim 1, further comprising monitoring, by the first programmatic container in response to intercepting the system call, a localhost interface of the first computer for the metrics message sent from the second programmatic container.
 10. The computer-implemented method of claim 9, wherein the localhost interface of the first computer is at local IP address 127.0.0.1.
 11. A computer system comprising: one or more processors; one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to: intercept, by a first programmatic container of a first computing device, a system call made by a second programmatic container to an operating system of the first computing device; in response to intercepting the system call, generate, by the first programmatic container, an enriched message based at least in part on the intercepted system call and a metrics message sent from the second programmatic container to an interface of the first computer; and send the enriched message to a monitoring application hosted on a second computer.
 12. The computer system of claim 11, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to: send, from the second programmatic container, the metrics message to the interface of the first computing device; and make, by the second programmatic container, the system call to the operating system of the first computing device.
 13. The computer system of claim 11, wherein the first programmatic container is configured to intercept the system call and to generate the enriched message with no direct communication between the first programmatic container and the second programmatic container.
 14. The computer system of claim 11, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to: intercept, by the first programmatic container, a plurality of system calls made by the second programmatic container to the operating system of the first computing device; and in response to intercepting the plurality of system calls, generate, by the first programmatic container, a plurality of enriched messages based at least in part on the intercepted plurality of system calls and a plurality of metrics messages sent from the second programmatic container to the interface of the first computer.
 15. The computer system of claim 14, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to: aggregate the plurality of enriched messages; and send the aggregated plurality of enriched messages to the monitoring application hosted on the second computer.
 16. The computer system of claim 11, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to: intercept, by the first programmatic container, a plurality of system calls made by a plurality of second programmatic containers to the operating system of the first computing device; and in response to intercepting the plurality of system calls, generate, by the first programmatic container, a plurality of enriched messages based at least in part on the intercepted plurality of system calls and a plurality of metrics messages sent from the plurality of second programmatic containers to the interface of the first computer.
 17. The computer system of claim 11, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to generate, by the first programmatic container, a second enriched message based at least in part on a second metrics message sent from the second programmatic container to the first programmatic container.
 18. The computer system of claim 11, wherein the first programmatic container is configured to generate the enriched message, at least in part, by adding one or more of a container name tag, an application ID tag, or an image name tag to the enriched message associated with the second programmatic container.
 19. The computer system of claim 11, further comprising the one or more memories storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to monitor, by the first programmatic container in response to intercepting the system call, a localhost interface of the first computer for the metrics message sent from the second programmatic container.
 20. The computer system of claim 19, wherein the localhost interface of the first computer is at local IP address 127.0.0.1.